182 research outputs found

    A walk on Python-igraph

    Brief tutorial on the i-graph library and application for programmers

    A walk on Python-igraph

    Brief tutorial of the i-graph library, a tool for programmers

    Retrieval of bilingual Spanish-English information by means of a standard automatic translation system

    This paper describes our participation in bilingual retrieval (queries in Spanish on documents in English) by means of an information retrieval system based on the vector model. The queries, formulated in Spanish, were translated into English with a commercial automatic translation system; the terms extracted from the resulting translations were filtered to remove stop words and then normalised by stemming. Results are poorer, by slightly more than 15%, than those obtained through monolingual retrieval with the original queries in English.
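
    The retrieval step described above can be sketched roughly as follows. This is a minimal illustration, not the authors' system: it assumes plain term-frequency weights and cosine similarity as the vector-model scoring, the stop-word list and the toy documents are hypothetical, and stemming is left out.

```python
import math
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list; a real system would use a fuller one.
STOP_WORDS = {"the", "of", "in", "a", "an", "and", "to", "for", "on", "by"}


def terms(text):
    """Lowercase, tokenise, and drop stop words (stemming is omitted here)."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOP_WORDS]


def cosine(a, b):
    """Cosine similarity between two term-frequency vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def rank(query, documents):
    """Return (doc_id, score) pairs sorted by decreasing similarity to the query."""
    q = Counter(terms(query))
    scored = [(doc_id, cosine(q, Counter(terms(text)))) for doc_id, text in documents.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy collection; the query stands in for the English output of the translation step.
docs = {
    "d1": "Automatic translation of queries for information retrieval",
    "d2": "Cooking recipes for the summer",
}
ranking = rank("retrieval of translated queries", docs)
```

    Here `ranking` lists the document that shares terms with the translated query first; a document with no terms in common scores zero.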

    Automatic Classification of Documents. A Case Study

    Classifying documents is labour-intensive and may become impractical when the number of documents is high. When documents are in digital format, automatic classification techniques can be applied. Supervised automatic classification systems are able to identify the appropriate class or category for a given document after a learning or training phase, during which the system learns the features that define each category. We describe some of the most widely used techniques, such as Bayesian classifiers, as well as the adjustments that can be made to improve their effectiveness. We also describe an application of these techniques to a real case, analyze the implementation details, and discuss the results.
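
    As an illustration of the supervised approach summarised above, the following is a minimal sketch of a multinomial naive Bayes classifier with add-one smoothing. It is not the implementation from the paper; the class name `NaiveBayes`, its methods, and the toy training data are all hypothetical.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.class_docs = Counter()               # documents seen per class
        self.term_counts = defaultdict(Counter)   # term frequencies per class
        self.vocabulary = set()

    def train(self, tokens, label):
        """Learning phase: accumulate term statistics for one labelled document."""
        self.class_docs[label] += 1
        self.term_counts[label].update(tokens)
        self.vocabulary.update(tokens)

    def classify(self, tokens):
        """Return the class with the highest log posterior for the document."""
        total_docs = sum(self.class_docs.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_docs.items():
            counts = self.term_counts[label]
            denom = sum(counts.values()) + len(self.vocabulary)
            score = math.log(n_docs / total_docs)  # class prior
            for t in tokens:
                if t in self.vocabulary:  # ignore terms never seen in training
                    score += math.log((counts[t] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy training set (made up for illustration).
clf = NaiveBayes()
clf.train(["goal", "match", "team"], "sports")
clf.train(["team", "league", "goal"], "sports")
clf.train(["election", "vote", "party"], "politics")
clf.train(["party", "government", "vote"], "politics")
predicted = clf.classify(["vote", "election", "team"])
```

    Working in log space avoids numeric underflow on long documents, and the smoothing term is one example of the kind of adjustment that can affect effectiveness.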

    Web Page Retrieval by Combining Evidence

    The participation of the REINA Research Group in WebCLEF 2005 focused on the monolingual mixed task. Queries or topics are of two types: named pages and home pages. For both, we first perform a search by thematic content; for the same query, we also search several information elements of every page (title, some meta tags, anchor text) and then combine the results. For queries about home pages, we try to detect them using a method based on some keywords and their patterns of use. Afterwards, the results of the thematic content retrieval are re-ranked based on PageRank and centrality coefficients.
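
    One simple way to combine evidence of this kind is a weighted sum of per-field scores followed by a blend with a link-based score. The sketch below assumes that scheme purely for illustration; the abstract does not state the actual combination formula, and the function names, weights, `alpha`, and all scores are hypothetical.

```python
def combine_fields(field_scores, weights):
    """Weighted linear combination of per-field retrieval scores for each page."""
    combined = {}
    for field, scores in field_scores.items():
        for page, score in scores.items():
            combined[page] = combined.get(page, 0.0) + weights.get(field, 0.0) * score
    return combined


def rerank(content_scores, link_scores, alpha=0.7):
    """Blend content scores with a link-based score and sort by the result."""
    blended = {
        page: alpha * score + (1 - alpha) * link_scores.get(page, 0.0)
        for page, score in content_scores.items()
    }
    return sorted(blended, key=blended.get, reverse=True)


# Hypothetical scores in [0, 1] for three pages; all figures are made up.
field_scores = {
    "title": {"p1": 0.9, "p2": 0.2},
    "meta": {"p1": 0.1, "p3": 0.6},
    "anchor": {"p2": 0.8, "p3": 0.4},
}
weights = {"title": 0.5, "meta": 0.2, "anchor": 0.3}  # assumed, not from the paper
link_scores = {"p1": 0.1, "p2": 0.9, "p3": 0.3}       # stand-in for PageRank or centrality

content = combine_fields(field_scores, weights)
order = rerank(content, link_scores)
```

    With these made-up numbers, a page that ranks second on content alone can move to the top once its link-based score is taken into account.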

    The implications of Wikipedia for contemporary science education: Using Social Network Analysis Techniques for Automatic Organisation of Knowledge

    Wikipedia is an Open Content resource built by a community of users and widely used in educational contexts by both students and teachers. Wikipedia articles are connected by hyperlinks, so Wikipedia can be represented as a network in which the nodes are articles and the edges are hyperlinks. In this paper we analyze a complete copy of the Spanish Wikipedia. We apply social network analysis techniques and, more precisely, community detection techniques to identify clusters of articles with similar content. As the number of clusters is relatively small, we use manual analysis to detect science articles. In addition, we identify the most representative scientific fields and their main features. We conclude that science articles make up about 11.66% of Spanish Wikipedia articles and that the most important clusters of scientific articles do not always coincide with the classical scientific disciplines. This kind of analysis contributes to a better understanding of Wikipedia as an educational tool.

    Análisis cibermétrico y visual de Twitter

    This paper addresses the need to collect the profile, followers, and followed accounts of a Twitter user via the API. It develops a crawler application using the Python-Twitter library, with the aim of analysing and visualising the network of Twitter users.

    La cibermetría en la recuperación de información en el Web

    The exponential growth of the web and the characteristics of its data (distributed, highly volatile, unstructured, redundant, and highly heterogeneous) have introduced new problems in information retrieval processes. It is therefore necessary to open new avenues of research that allow us to obtain good levels of accuracy. Work based on exploiting the hypertext features of the web is gaining prominence. Cybermetrics provides many options for working with links, and many of its techniques may be useful in information retrieval on the web.

    Science and Technology in Social Networks: Twitter

    The use of the Internet as the main source for finding scientific information is reinforced by the use of social networks. This situation calls for the study and standardisation of the content obtained in this way. By studying the Twitter profiles that disseminate scientific information, it is possible to identify the main topics and the quantity and quality of the scientific information shared.